Abstract

Many stochastic systems in physics and biology are investigated by recording the two-dimensional
(2D) positions of a moving test particle at regular time intervals. The resulting sample trajectories
are then used to infer the properties of the underlying stochastic process. Often, it can be assumed
a priori that the underlying discrete-time random walk model is independent of absolute position
(homogeneity), direction (isotropy) and time (stationarity), and that it is ergodic.
In this article we first review some common statistical methods for analyzing 2D trajectories, based
on quantities with built-in rotational invariance. We then discuss an alternative approach in which
the two-dimensional trajectories are reduced to one dimension by projection onto an arbitrary axis
and rotational averaging. Each step of the resulting 1D trajectory is further factorized into sign
and magnitude. The statistical properties of the signs and magnitudes are mathematically related
to those of the step lengths and turning angles of the original 2D trajectories, demonstrating that
no essential information is lost by this data reduction. The resulting binary sequence of signs lends
itself to a pattern counting analysis, revealing temporal properties of the random process that are
not easily deduced from conventional measures such as the velocity autocorrelation function.
In order to highlight this simplified 1D description, we apply it to a 2D random walk with restricted
turning angles (RTA model), defined by a finite-variance distribution $p(L)$ of step lengths and a
narrow turning angle distribution $p(\phi)$, assuming that the lengths and directions of the steps are
independent.
QUANTIFYING 2D TRAJECTORIES

We consider a measured trajectory (Fig. 1) that consists of $N+1$ discrete two-dimensional
points $\vec{R}_t = (x_t, y_t)$ with $t = 0 \ldots N$, sampled in equal time intervals $\delta t_{rec}$. In the following, it
is implicitly assumed that all absolute times $t$ and lag times $\Delta t$ are in units of this sampling
interval. In a spatially homogeneous system, the absolute positions $\vec{R}_t$ are of no importance
by themselves. All relevant information about the walk is contained in the $N$ steps $\vec{u}_t =
\vec{R}_t - \vec{R}_{t-1}$, or the corresponding velocities $\vec{v}_t = \vec{u}_t / \delta t_{rec}$. These steps can be described in
Cartesian or polar coordinate systems, $\vec{u}_t = (\Delta x_t, \Delta y_t) = (L_t \cos\varphi_t, L_t \sin\varphi_t) = L_t \vec{e}_t$, where
$\vec{e}_t$ is a unit direction vector. The angle $\phi_t = \varphi_t - \varphi_{t-1}$ between two successive step vectors is
called the turning angle. In an isotropic system, the absolute step directions, measured by
the angles $\varphi_t$, are likewise of no importance. The only relevant information of a trajectory
is therefore contained in the set $\{(L_t, \phi_t) : t = 1 \ldots N\}$ of subsequent step lengths and turning
angles. In the most general case, the underlying discrete-time random walk model has to
determine the combined probability density $p(L_1, \phi_1, L_2, \phi_2, \ldots, L_N, \phi_N)$. In a stationary
random process, the stochastic properties can only depend on the differences between the
time indices. A stationary walk is therefore described by the combined probability density
$p(\ldots, L_{t-1}, \phi_{t-1}, L_t, \phi_t, L_{t+1}, \phi_{t+1}, \ldots)$.
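The decomposition of a recorded trajectory into step lengths, absolute directions and turning angles can be sketched in a few lines of Python. This is only an illustration: the function name `decompose` and the test path are our own choices, and the sketch returns one turning angle per pair of successive steps, so the first step has no turning angle assigned.

```python
import math

def decompose(points):
    """Return step lengths, absolute directions and turning angles of a 2D path."""
    steps = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(points, points[1:])]
    lengths = [math.hypot(dx, dy) for dx, dy in steps]
    directions = [math.atan2(dy, dx) for dx, dy in steps]
    # Wrap each direction difference into [-pi, pi) so that a small left or
    # right turn is not reported as a near-full rotation.
    turning = [(b - a + math.pi) % (2 * math.pi) - math.pi
               for a, b in zip(directions, directions[1:])]
    return lengths, directions, turning

# Hypothetical test path: three unit steps, each followed by a left turn.
lengths, directions, turning = decompose([(0, 0), (1, 0), (1, 1), (0, 1)])
```

Wrapping the angle differences matters in practice, because `atan2` returns directions in a fixed interval and a walker crossing the branch cut would otherwise produce spurious turning angles close to $\pm 2\pi$.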



AGGREGATED STATISTICAL PROPERTIES 

The aggregated statistical properties of the system are extracted by computing suitable 
averages. Because of the stationarity and ergodicity of the random process, we can replace 
ensemble averages (over different trajectories) by time averages. In the following we denote
the average of a quantity $f_t$ over all absolute time points as $\langle f_t \rangle_t = \frac{1}{N} \sum_{t=1}^{N} f_t$.

It is important to keep in mind that, in general, a finite trajectory does not show all
the symmetries of the underlying random process. For example, when analyzing relatively
short trajectories with directional persistence, it may happen that all step directions fall
into a narrow range of absolute angles $\varphi_t$. This can cause artifacts, such as significantly
different distributions $p(\Delta x)$ and $p(\Delta y)$ of the Cartesian step components. To avoid such
problems, one strategy is to use only quantities that are, by definition, invariant with respect
to translations and rotations of the trajectories, such as the step lengths $L_t$ and the turning
angles $\phi_t$.



DISTRIBUTIONS AND CORRELATION PROPERTIES OF STEP LENGTH 
AND TURNING ANGLES 

The probability distributions of the step lengths and turning angles can be expressed
formally as

$$ p(L) = \langle \delta(L - L_t) \rangle_t, \qquad p(\phi) = \langle \delta(\phi - \phi_t) \rangle_t. \tag{1} $$

From them follow the mean values (denoted by $\bar{L}$ and $\bar{\phi}$) and the variances (denoted by
$\sigma_L^2$ and $\sigma_\phi^2$). In this paper, we assume that $\bar{L}$ and $\sigma_L^2$ are finite, excluding random walks
with a heavy-tailed step length distribution such as the Lévy flight.

Besides these distributions, one should also take into account possible temporal correlations
of these quantities. The normalized autocorrelation functions of $L$ and $\phi$ (and the
cross-correlation function, respectively) are defined as

$$ C_{LL}(\tau) = \langle (L_t - \bar{L})(L_{t+\tau} - \bar{L}) \rangle_t / \sigma_L^2, $$

$$ C_{L\phi}(\tau) = \langle (L_t - \bar{L})(\phi_{t+\tau} - \bar{\phi}) \rangle_t / (\sigma_L \sigma_\phi), \tag{2} $$

with $C_{\phi\phi}(\tau)$ defined analogously to $C_{LL}(\tau)$.

We note that in most "standard models" of random walks, such as the discrete-time
correlated random walk, $L$ and $\phi$ are drawn independently from fixed distributions $p(L)$
and $p(\phi)$, so that $C_{LL}(\tau) = C_{\phi\phi}(\tau) = \delta_{\tau,0}$ and $C_{L\phi}(\tau) = 0$. However, some more complex
stochastic systems show "super-statistical" effects, such as temporally correlated fluctuations
of the step length, $C_{LL}(\tau > 0) \neq 0$, or a time-dependent variance $\sigma_\phi^2 = f(t)$ of the turning
angle.
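A normalized correlation function of the kind used in Eq. 2 can be estimated from a finite record as in the following sketch. The function names are hypothetical, and the treatment of the record boundaries (only pairs that lie fully inside the record are averaged) is an assumption of this example, since the text does not specify it.

```python
def time_average(values):
    return sum(values) / len(values)

def correlation(a, b, tau):
    """Normalized <(a_t - mean_a)(b_{t+tau} - mean_b)>_t / (sd_a * sd_b)."""
    mean_a, mean_b = time_average(a), time_average(b)
    sd_a = time_average([(x - mean_a) ** 2 for x in a]) ** 0.5
    sd_b = time_average([(y - mean_b) ** 2 for y in b]) ** 0.5
    # Only pairs (t, t + tau) that both lie inside the record can be used.
    pairs = list(zip(a, b[tau:]))
    cov = time_average([(x - mean_a) * (y - mean_b) for x, y in pairs])
    return cov / (sd_a * sd_b)

# Hypothetical test series: a strictly alternating sequence is perfectly
# anti-correlated at lag 1.
alternating = [1.0, -1.0] * 50
c0 = correlation(alternating, alternating, 0)
c1 = correlation(alternating, alternating, 1)
```

Passing the same series twice gives an autocorrelation such as $C_{LL}(\tau)$, while passing step lengths and turning angles gives the cross-correlation $C_{L\phi}(\tau)$.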
VECTORIAL VELOCITY AUTOCORRELATION FUNCTION AND MEAN
SQUARED DISPLACEMENT



In contrast to the step vectors $\vec{u}_t$ themselves, certain combinations such as the dot product
$\vec{u}_t \cdot \vec{u}_{t+\tau}$ are translationally and rotationally invariant. Therefore, a frequently used measure for
the temporal structure of a random walk is the vectorial velocity autocorrelation (VAC)
function:

$$ C_{uu}(\tau) = \langle (\vec{u}_t - \bar{\vec{u}}) \cdot (\vec{u}_{t+\tau} - \bar{\vec{u}}) \rangle_t / \sigma_u^2. \tag{3} $$

Another popular quantity with translational and rotational invariance is the Mean
Squared Displacement (MSD):

$$ \overline{\Delta R^2}(\tau) = \langle (\vec{R}_{t+\tau} - \vec{R}_t)^2 \rangle_t. \tag{4} $$

The MSD is mathematically related to the VAC by

$$ \overline{\Delta R^2}(\tau) = \sigma_u^2 \sum_{t=-\tau}^{+\tau} (\tau - |t|) \, C_{uu}(t). \tag{5} $$

From a practical point of view, the MSD has the advantage of being less sensitive to statistical
noise, due to the summation. Also, while the normalized VAC always starts with the
value 1 at lagtime zero, the MSD shows explicitly the scale of the displacements. Note that
the scale factor $\sigma_u^2 = \langle L_t^2 \rangle_t = \bar{L}^2 + \sigma_L^2$ only depends on the first and second moment of the step
length distribution. The shape of the MSD, on the other hand, is entirely determined by
the VAC.

It is worthwhile to consider which properties of the trajectory are responsible for this
shape: For a trajectory without drift ($\bar{\vec{u}} = 0$), the VAC depends only on the expression
$\langle \vec{u}_t \cdot \vec{u}_{t+\tau} \rangle_t = \langle L_t L_{t+\tau} \cos(\varphi_{t+\tau} - \varphi_t) \rangle_t$. Consider first the case when step lengths and step
directions are statistically independent. Then $\langle \vec{u}_t \cdot \vec{u}_{t+\tau} \rangle_t$ factorizes as
$\langle L_t L_{t+\tau} \cos(\varphi_{t+\tau} - \varphi_t) \rangle_t = \langle L_t L_{t+\tau} \rangle_t \cdot \langle \cos(\varphi_{t+\tau} - \varphi_t) \rangle_t$.
The first factor describes possible correlations between successive step lengths. However, the
expression $\langle L_t L_{t+\tau} \rangle_t = \bar{L}^2 + \langle \Delta L_t \Delta L_{t+\tau} \rangle_t$ is always non-zero, even if the step length
fluctuations are mutually uncorrelated. The second factor describes directional correlations and it
can be zero. If it is zero for all $\tau > 0$, the VAC is $\delta$-correlated and the MSD increases linearly with
lagtime, indicating trivial diffusive behavior. Consequently, any non-trivial lagtime-dependence of
the VAC/MSD (such as two distinct lagtime regimes, sub-diffusive, or super-diffusive behavior) can
only arise if directional correlations are present. In this case, step length correlations may have an
additional effect on the shape of the VAC/MSD. The most general case would even include
cross-correlations between step lengths and step directions.
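The relation between the MSD and the step correlations (Eq. 5) can be illustrated with a short numerical sketch. The function names are hypothetical. To keep the example exact, it uses the un-centred dot-product average $\langle \vec{u}_t \cdot \vec{u}_{t+\tau} \rangle_t$ in place of $\sigma_u^2 C_{uu}$ (the two coincide for a drift-free walk), and the test trajectory is a straight line with unit steps, for which every dot product equals 1 and the MSD must equal $\tau^2$. For a real, finite random trajectory the two sides agree only up to statistical and boundary effects.

```python
def msd(points, tau):
    """Time-averaged squared displacement over a lag of tau steps."""
    d2 = [(x1 - x0) ** 2 + (y1 - y0) ** 2
          for (x0, y0), (x1, y1) in zip(points, points[tau:])]
    return sum(d2) / len(d2)

def step_dot_average(points, tau):
    """Time average of u_t . u_{t+tau} (un-centred, drift is not removed)."""
    steps = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(points, points[1:])]
    dots = [ax * bx + ay * by
            for (ax, ay), (bx, by) in zip(steps, steps[tau:])]
    return sum(dots) / len(dots)

def msd_from_dots(points, tau):
    """Sum over t = -tau..tau of (tau - |t|) <u_0 . u_t>."""
    return sum((tau - abs(t)) * step_dot_average(points, abs(t))
               for t in range(-tau, tau + 1))

# Straight walk with unit steps: every dot product is 1, so MSD(tau) = tau^2.
line = [(float(i), 0.0) for i in range(50)]
direct = msd(line, 5)
via_sum = msd_from_dots(line, 5)
```

The weights $(\tau - |t|)$ simply count how many pairs of steps inside a window of $\tau$ steps are separated by a lag $t$.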

THE PROJECTED TRAJECTORY AND ROTATIONAL AVERAGING 

Next we consider aggregated statistical properties based on quantities without built-in
rotational invariance. In particular, we analyze the projection of the 2D trajectory onto
some axis, for example the x-axis of the coordinate system. By the projection, the sequence
of vectorial steps $\vec{u}_t$ is reduced to a sequence of scalar steps $\Delta x_t$, so that some directional
information is lost. However, we can define for the 1D trajectory a pair of quantities
equivalent to the step length and the step direction, by factorizing each scalar step into a
magnitude and a sign factor:

$$ \Delta x_t = m_t \cdot s_t = |\Delta x_t| \cdot \mathrm{sgn}(\Delta x_t). \tag{6} $$

We can compute the distributions $p(m)$ and $p(s)$ of the magnitudes and signs by temporal
averaging over the projected trajectory. However, in order to avoid the above-mentioned
artifacts related to the finite number of steps, we have to additionally perform a rotational
averaging over the absolute direction angles $\varphi$ of each step:

$$ \langle f \rangle_\varphi = \frac{1}{2\pi} \int_0^{2\pi} d\varphi \, f(\varphi). \tag{7} $$

Using this notation, we can write

$$ p(m) = \langle \langle \delta(m - m_t) \rangle_t \rangle_\varphi. \tag{8} $$
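The factorization into magnitude and sign, together with a discretized rotational average, might be implemented as follows. This is a sketch under our own assumptions: the rotational average is approximated by a finite number of equally spaced projection axes, the function names are hypothetical, and a step exactly perpendicular to the axis is assigned the sign $+1$ by convention.

```python
import math

def project(steps, axis_angle=0.0):
    """Project 2D steps onto an axis and split each scalar step into (m, s)."""
    ax, ay = math.cos(axis_angle), math.sin(axis_angle)
    scalars = [dx * ax + dy * ay for dx, dy in steps]
    magnitudes = [abs(v) for v in scalars]
    # A step exactly perpendicular to the axis gets sign +1 here (a convention
    # chosen for this sketch; the text does not specify this edge case).
    signs = [1 if v >= 0 else -1 for v in scalars]
    return magnitudes, signs

def rotationally_averaged_mean_magnitude(steps, n_axes=360):
    """Average the mean magnitude over n_axes equally spaced projection axes."""
    total = 0.0
    for k in range(n_axes):
        m, _ = project(steps, 2 * math.pi * k / n_axes)
        total += sum(m) / len(m)
    return total / n_axes

m, s = project([(3.0, 4.0), (-2.0, 0.0)])
avg = rotationally_averaged_mean_magnitude([(3.0, 4.0)])
```

Rotating the projection axis is equivalent to rotating the trajectory, so a single step of length 5 yields an averaged magnitude close to $5 \cdot 2/\pi$, the mean of $|\cos\varphi|$ over the full circle times the step length.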
We next consider how the distribution $p(m)$ is related to $p(L)$. If a single vectorial
trajectory step $\vec{u}$ of step length $L$ is isotropically rotated, it produces a whole distribution
$p(m, L)$ of projected magnitudes, ranging from $m = 0$ to $m = L$:

$$ p(m, L) = \frac{2 \, \theta(m) \, \theta(L - m)}{\pi \sqrt{L^2 - m^2}}, \tag{9} $$

where $\theta(\cdot)$ indicates the Heaviside step function. Therefore, a given step length distribution
$p(L)$ produces a corresponding distribution of magnitudes that is given by

$$ p(m) = \int_0^\infty dL \, p(L) \, p(m, L). \tag{10} $$



The quantity in 1D that corresponds to the turning angle distribution $p(\phi)$ in 2D is the
probability $q = \mathrm{Prob}(s_{t+1} = s_t)$ that two successive scalar steps have the same sign. The
probability $q$ can also be called the persistence parameter of a trajectory, since $q = 1/2$
indicates non-persistent behavior, while $q$-values smaller (larger) than 1/2 indicate
sub-diffusive (super-diffusive) behavior. In order to derive a relation between $p(\phi)$ and $q$, we
consider a sequence of two vectorial steps $\vec{u}_1$ and $\vec{u}_2$, enclosing a turning angle $\phi$ (compare
Fig. 2) with probability $p(\phi) d\phi$. Assume that initially the two sign factors $s_1$ and $s_2$ are both
positive. If we now gradually increase $\phi$, there is a critical turning angle $\phi_c(\varphi)$ at which
$s_1 = s(\varphi) = \mathrm{sgn}(\cos(\varphi))$ becomes different from $s_2 = s(\varphi + \phi) = \mathrm{sgn}(\cos(\varphi + \phi))$. For a
uniformly distributed absolute direction $\varphi$, a fixed turning angle $\phi$ therefore changes the sign
with probability $|\phi|/\pi$. We can therefore express $q$ as an angular integral over $p(\phi)$:

$$ q = \int_{-\pi}^{+\pi} d\phi \; p(\phi) \left( 1 - \frac{|\phi|}{\pi} \right). \tag{11} $$

The fact that $p(\phi)$ is a distribution, whereas $q$ is just a number, demonstrates the information
loss associated with the projection from 2D to 1D. This missing information is the
detailed shape of the turning angle distribution, which in many cases will not be of particular
interest. In this sense, the 1D projection of a 2D trajectory with rotational averaging
represents a useful simplification and helps to filter out the essential information.
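The link between the turning angle distribution and the persistence parameter $q$ can be checked by brute-force counting, without using any closed formula. The sketch below is only an illustration under stated assumptions: the absolute direction is sampled on a regular grid, the turning angles are taken as midpoint samples of a uniform density on $[-\phi_{max}, +\phi_{max}]$ with $\phi_{max} = \pi/20$, and all names are hypothetical. For this uniform case the expected value is $q = 1 - \phi_{max}/(2\pi)$.

```python
import math

def same_sign_probability(turning_angles, n_directions=3600):
    """Fraction of (direction, turning angle) pairs whose x-projections share a sign."""
    same = 0
    for k in range(n_directions):
        # Half-cell offset keeps the sampled first-step directions off the
        # exact perpendicular, where the sign would be undefined.
        direction = 2 * math.pi * (k + 0.5) / n_directions
        for phi in turning_angles:
            if math.cos(direction) * math.cos(direction + phi) > 0:
                same += 1
    return same / (n_directions * len(turning_angles))

phi_max = math.pi / 20
n_phi = 100
# Midpoint samples of a uniform turning-angle density on [-phi_max, +phi_max].
uniform_angles = [-phi_max + (j + 0.5) * 2 * phi_max / n_phi for j in range(n_phi)]
q_numeric = same_sign_probability(uniform_angles)
q_closed_form = 1 - phi_max / (2 * math.pi)
```

Any other turning angle distribution can be tested in the same way by replacing `uniform_angles` with samples drawn from it.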
Finally, the temporal structure of the projected trajectory can be analyzed with the
rotationally averaged autocorrelation function of the scalar steps:

$$ C_{\Delta x \Delta x}(\tau) = \langle \langle (\Delta x_t - \overline{\Delta x})(\Delta x_{t+\tau} - \overline{\Delta x}) \rangle_t \rangle_\varphi / \sigma_{\Delta x}^2. \tag{12} $$

The correlation functions $C_{mm}(\tau)$, $C_{ss}(\tau)$ and $C_{ms}(\tau)$ are defined in an analogous way.

DISCRETE PATTERN STATISTICS AND THE PERSISTENT MARKOV CHAIN 
OF SIGNS 

By the above procedure, we have mapped an originally two-dimensional trajectory onto
two scalar time series, $m_t$ and $s_t$. Since the signs $s_t$ are binary variables, we can apply to
them analysis tools that are tailor-made for discrete random processes. In particular, we can
count the frequency of patterns, such as "- + -", within the time series. Once the probabilities
for all patterns of a given length are known, it is straightforward to construct a higher order
Markov model that replicates the statistical properties of a measured time series.

The principles of pattern statistics can be demonstrated with a simple binary Markov
chain $s_t$. At $t = 0$ it starts randomly with "-" or "+" (equal probability). For each
subsequent time, $\mathrm{Prob}(s_{t+1} = s_t) = q$, with a pre-defined persistence parameter $q$. What is
the frequency of a given pattern, such as "- + -", in this model? In this particular case,
p("- + -") = p("-") p("-" → "+") p("+" → "-"), which yields
p("- + -") = (1/2) (1 − q) (1 − q). For reasons of symmetry, the probability of any pattern
is equal to that of its inverse, where all "+" and "-" are exchanged. Also, we can
temporally reverse a pattern without changing its probability. Thus, there are only 3 distinct
patterns of length 3, and their relative frequencies in our model are: p("- - -") ∝ $q^2$,
p("- - +") ∝ $q(1 - q)$ and p("- + -") ∝ $(1 - q)^2$. They all become equally frequent for
the non-persistent case q = 1/2.
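Pattern counting in a measured sign sequence, and the corresponding exact pattern probability in the persistent Markov chain, can be sketched as follows. The function names and the string representation of the signs ("+" and "-") are our own choices for this illustration; overlapping windows are counted, which is one of several possible conventions.

```python
from collections import Counter

def pattern_frequencies(signs, length):
    """Relative frequency of every sign pattern of the given length (sliding window)."""
    windows = [signs[i:i + length] for i in range(len(signs) - length + 1)]
    counts = Counter(windows)
    return {pattern: c / len(windows) for pattern, c in counts.items()}

def markov_pattern_probability(pattern, q):
    """Exact probability of a pattern in the symmetric persistent Markov chain."""
    prob = 0.5                      # first sign: "+" or "-" with equal probability
    for previous, current in zip(pattern, pattern[1:]):
        prob *= q if current == previous else 1 - q
    return prob

freq = pattern_frequencies("++-+", 2)
p_alt = markov_pattern_probability("-+-", 0.9)
all_patterns = [a + b + c for a in "+-" for b in "+-" for c in "+-"]
total = sum(markov_pattern_probability(p, 0.9) for p in all_patterns)
```

Comparing the measured `pattern_frequencies` of a projected trajectory with `markov_pattern_probability` evaluated at the measured $q$ shows directly which patterns are over- or under-represented relative to the simple Markov chain.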

THE RESTRICTED TURNING ANGLE (RTA) MODEL 

We next consider the class of 2D random walk models in which the step lengths $L$ and
step directions $\varphi$ are statistically independent, $\langle L_t \varphi_{t+\tau} \rangle_t = 0$. The step lengths have a
fixed distribution $p(L)$ with finite values for mean $\bar{L}$ and variance $\sigma_L^2$. The turning angles
also have a fixed distribution $p(\phi)$, with zero mean and variance $\sigma_\phi^2$. In particular, we are
interested in the case where $p(\phi)$ is rather narrow (restricted turning angles), so that the walk
has directional persistence. We call this case the RTA model in the following.

It is straightforward to show that under the given assumptions the vectorial velocity
autocorrelation function is given as

$$ C_{uu}(\tau = n \, \delta t_{rec}) = \frac{\bar{L}^2 \langle \cos(\varphi_{t+n} - \varphi_t) \rangle_t + \sigma_L^2 \, \delta_{n,0}}{\bar{L}^2 + \sigma_L^2}. \tag{13} $$



The directional correlation factor can be expressed as an integral over turning angles:

$$ \langle \cos(\varphi_{t+n} - \varphi_t) \rangle_t = \int_{-\pi}^{+\pi} d\phi \; p(\phi, n) \cos(\phi). \tag{14} $$



Here, $p(\phi, n)$ is the probability density that a vectorial step and its $n$th successor enclose
a turning angle $\phi$. It is clear that $p(\phi, n = 0) = \delta(\phi - 0)$, that $p(\phi, n = 1) = p(\phi)$ is just
the prescribed turning angle distribution and that $p(\phi, n \to \infty) \to 1/(2\pi)$. The temporal
development of $p(\phi, n)$ corresponds to a kind of diffusion process on the unit circle. As long
as the width of $p(\phi, n)$ is smaller than $2\pi$, we can view the process as a diffusion on a linear
$\phi$-axis. Then, $p(\phi, n)$ is just the $n$-fold convolution of the turning angle distribution $p(\phi)$ with
itself. For lagtimes $1 \ll n \ll n_{max} = 2\pi / \sqrt{\sigma_\phi^2}$, the distributions $p(\phi, n)$ resemble normalized
Gaussians with zero mean and a lagtime-dependent variance $\sigma_\phi^2(n) = n \cdot \sigma_\phi^2$. We insert the
approximation $p(\phi, n) \approx \frac{1}{\sqrt{2\pi \sigma_\phi^2(n)}} \, e^{-(1/2) \phi^2 / \sigma_\phi^2(n)}$ into Eq. 14 and obtain analytically

$$ \langle \cos(\varphi_{t+n} - \varphi_t) \rangle_t \approx e^{-(\sigma_\phi^2/2) \, n}. \tag{15} $$
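The quality of the exponential approximation for the directional correlation factor can be checked for one concrete case. The sketch below assumes independent turning angles that are uniformly distributed on $[-\phi_{max}, +\phi_{max}]$ (an illustrative choice with $\sigma_\phi^2 = \phi_{max}^2/3$). For independent, symmetric turning angles the average $\langle \cos(\phi_1 + \ldots + \phi_n) \rangle$ factorizes exactly into the $n$th power of $\langle \cos\phi \rangle = \sin(\phi_{max})/\phi_{max}$, which provides a reference value. The function names are hypothetical.

```python
import math

def directional_correlation_exact(n, phi_max):
    """<cos(sum of n i.i.d. turning angles)> for angles uniform on [-phi_max, phi_max]."""
    # For independent, symmetric angles the average factorizes into the n-th
    # power of the single-step average <cos(phi)> = sin(phi_max) / phi_max.
    return (math.sin(phi_max) / phi_max) ** n

def directional_correlation_gauss(n, phi_max):
    """Gaussian approximation exp(-(sigma^2 / 2) n) with sigma^2 = phi_max^2 / 3."""
    return math.exp(-(phi_max ** 2 / 3) / 2 * n)

phi_max = math.pi / 20
table = [(n, directional_correlation_exact(n, phi_max),
          directional_correlation_gauss(n, phi_max)) for n in (1, 10, 100)]
```

For a narrow turning angle distribution the two expressions differ only at fourth order in $\phi_{max}$ per step, so the exponential decay is a good description over many lagtimes.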



Summing up, the velocity autocorrelation function in the RTA model will show a sudden
drop between lagtimes 0 and 1. This drop occurs because the variance of the uncorrelated
step lengths $L_t$ contributes only to the total velocity variance ($n = 0$). For intermediate
lagtimes in the regime $1 \ll n \ll n_{max} = 2\pi / \sqrt{\sigma_\phi^2}$ it will decay exponentially with a
characteristic decay time inversely proportional to the variance of the turning angle distribution:

$$ C_{uu}(n) \approx \frac{\sigma_L^2 \, \delta_{n,0} + \bar{L}^2 \, e^{-(\sigma_\phi^2/2) \, n}}{\bar{L}^2 + \sigma_L^2}. \tag{16} $$
SIMULATION OF RTA MODEL

For a concrete example, we consider Rayleigh-distributed step lengths with the most
probable value $L_p$, $p(L > 0) = (L/L_p^2) \, e^{-L^2/(2 L_p^2)}$, so that $\bar{L} = \sqrt{\pi/2} \, L_p$ and
$\sigma_L^2 = \frac{4 - \pi}{2} L_p^2$. The turning angles are assumed to be uniformly distributed within a narrow
interval $[-\phi_{max} \ldots +\phi_{max}]$, corresponding to a variance $\sigma_\phi^2 = \frac{1}{3} \phi_{max}^2$. Note that this model
has only the two parameters $L_p$ and $\phi_{max}$.

For the numerical simulation of the RTA model we set $L_p = 1.0$ lu, with an arbitrary length
unit lu, and $\phi_{max} = \pi/20$. The recording time interval is set to $\delta t_{rec} = 1$. A segment of a typical
trajectory is shown in Fig. 3. The numerically calculated step length distributions, turning
angle distributions and velocity autocorrelation function, together with the analytical results,
are shown in Figs. 4-5.
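A minimal simulation of the RTA model could look as follows. This is a sketch rather than the code used for the figures: the number of steps, the random seed, the number of projection axes used for the rotational average and all function names are assumptions of this example. The Rayleigh-distributed step lengths are generated by inverse-transform sampling.

```python
import math
import random

def simulate_rta(n_steps, L_p=1.0, phi_max=math.pi / 20, seed=0):
    """Generate the step vectors of one RTA trajectory."""
    rng = random.Random(seed)
    direction = rng.uniform(0.0, 2 * math.pi)    # isotropic initial direction
    steps = []
    for _ in range(n_steps):
        # Inverse-transform sampling of a Rayleigh-distributed step length;
        # 1 - random() lies in (0, 1], which keeps the logarithm finite.
        length = L_p * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
        direction += rng.uniform(-phi_max, phi_max)
        steps.append((length * math.cos(direction), length * math.sin(direction)))
    return steps

def persistence(steps, n_axes=36):
    """Fraction of successive scalar steps with equal sign, averaged over projection axes."""
    same = total = 0
    for k in range(n_axes):
        # Axes in [0, pi) suffice: the opposite axis flips all signs at once.
        ax, ay = math.cos(math.pi * k / n_axes), math.sin(math.pi * k / n_axes)
        scalars = [dx * ax + dy * ay for dx, dy in steps]
        same += sum(1 for a, b in zip(scalars, scalars[1:]) if a * b > 0)
        total += len(scalars) - 1
    return same / total

steps = simulate_rta(5000)
mean_length = sum(math.hypot(dx, dy) for dx, dy in steps) / len(steps)
q_estimate = persistence(steps)
```

With these parameters the mean step length should be close to $\sqrt{\pi/2} \, L_p$ and the rotationally averaged fraction of equal sign pairs close to $1 - \phi_{max}/(2\pi)$; without the average over projection axes, the estimate of $q$ from a single short trajectory fluctuates considerably more.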

1D-PROJECTION OF RTA MODEL

Using the transformation formula Eq. 10, we obtain for the 1D projection of the RTA
model with Rayleigh-distributed step lengths a Gaussian magnitude distribution
$p(m > 0) = \sqrt{2/\pi} \, (1/L_p) \, e^{-m^2/(2 L_p^2)}$ (see Fig. 6). According to Eq. 11, we expect a persistence parameter
$q = 1 - \phi_{max}/(2\pi)$, yielding $q = 0.975$ for a maximum turning angle $\phi_{max} = \pi/20$. By counting
the fraction of pairs of identical signs in the simulated time series $s_t$ from the projected RTA
model, we obtain $q = 0.974$.
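The Gaussian form of the magnitude distribution can be verified numerically. The sketch below assumes the single-step projection kernel $2/(\pi\sqrt{L^2 - m^2})$ for $0 < m < L$ and integrates it against the Rayleigh density. The substitution $L = \sqrt{m^2 + w^2}$, the cut-off `w_max`, the grid size and the function names are choices made for this illustration only.

```python
import math

def projected_density(m, step_density, w_max=12.0, n=20000):
    """Magnitude density p(m) from isotropic projection of steps with density p(L).

    The substitution L = sqrt(m^2 + w^2) removes the integrable singularity
    of 1/sqrt(L^2 - m^2) at L = m and gives
        p(m) = (2/pi) * integral_0^inf p(L(w)) / L(w) dw.
    """
    h = w_max / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h              # midpoint rule
        L = math.hypot(m, w)
        total += step_density(L) / L
    return 2.0 / math.pi * total * h

def rayleigh(L, L_p=1.0):
    return L / L_p ** 2 * math.exp(-L ** 2 / (2 * L_p ** 2))

def half_gaussian(m, L_p=1.0):
    return math.sqrt(2 / math.pi) / L_p * math.exp(-m ** 2 / (2 * L_p ** 2))

numeric = [projected_density(m, rayleigh) for m in (0.25, 1.0, 2.0)]
analytic = [half_gaussian(m) for m in (0.25, 1.0, 2.0)]
```

The same `projected_density` function can be applied to any other step length density with a finite variance, for which no closed form may be available.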

The autocorrelation function of $s_t$ is shown in Fig. 7. It is interesting to compare it with
$C_{ss}(n)$ for a persistent Markov chain of signs with the same average $q$, which decays like
$C_{ss}(n) = (2q - 1)^n$ for $q > 1/2$. Note that the two models agree only for $n = 0$ (automatic
due to the normalized autocorrelation function) and $n = 1$ (correct average fraction of equal
sign pairs). For larger lagtimes, the sign correlations in the projected RTA model decay more
slowly than in the Markov chain.
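The geometric decay of the sign correlations in the persistent Markov chain can be reproduced by propagating the two-state probabilities step by step. The following sketch uses a hypothetical function name and starts the chain in the state "+" with certainty, which is sufficient because of the symmetry between the two signs.

```python
def sign_autocorrelation_markov(n, q):
    """<s_t s_{t+n}> for the symmetric two-state chain with persistence q."""
    # Start in state "+" with certainty and propagate the probability of
    # being in "+" through n applications of the transition rule.
    p_plus = 1.0
    for _ in range(n):
        p_plus = p_plus * q + (1.0 - p_plus) * (1.0 - q)
    # <s_0 s_n> = P(same sign) - P(opposite sign); the chain has zero mean
    # and unit variance, so this is already the normalized autocorrelation.
    return 2.0 * p_plus - 1.0

q = 0.975
values = [sign_autocorrelation_markov(n, q) for n in range(6)]
```

Each transition multiplies the correlation by the factor $(2q - 1)$, which is why the Markov chain cannot reproduce a decay that is slower than a single exponential.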

The origin of these deviations is "higher order" correlations in the projected RTA model,
beyond a simple Markov approximation. Consider the projection of the RTA-trajectory onto
the x-axis. As long as the step direction is roughly parallel to the x-axis, the time series $s_t$
shows long sequences of identical signs, such as "+ + + + + + + +", which could also be called a
"bunching" of equal sign pairs. But occasionally, the direction diffuses into a close to vertical
position. During such phases, $s_t$ shows sequences such as "+ - + - + - +", corresponding to
an "anti-bunching" of equal sign pairs. Thus, while the average fraction of equal sign pairs
agrees in both models, they are spread over the time axis in a different way in the projected
RTA model. These differences are also reflected in the pattern statistics: For example, in
the projected RTA model (Fig. 9), the fraction of "- - +" patterns is significantly diminished,
compared to the Markov chain (Fig. 8).

A "MOMENTARY PERSISTENCE" VARIABLE AND ITS TEMPORAL
CORRELATIONS

The temporal distribution of equal sign pairs can be investigated by defining a momentary
persistence variable,

$$ \eta_t = \delta_{s_{t-1}, s_t}. \tag{17} $$

The global persistence parameter $q = \langle \eta_t \rangle_t$ is just the time average of this variable. In
a persistent Markov chain of signs, the random variable $\eta_t$ behaves like a Bernoulli process
with the probability $q$ for the event 1 and $1 - q$ for the event 0. For such a white noise
process, the autocorrelation function yields $C_{\eta\eta}(\tau) = \delta_{\tau,0}$. In the projected RTA model,
however, the momentary persistence should have some memory time larger than zero. This
is indeed the case, as shown in Fig. 10.
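The momentary persistence variable is straightforward to compute from a sign sequence, as the following sketch shows. The function name and the short test sequence are hypothetical; the sequence is chosen so that it contains both a run of equal signs and a run of alternating signs.

```python
def momentary_persistence(signs):
    """eta_t = 1 if s_{t-1} == s_t, else 0 (one value per successive pair)."""
    return [1 if a == b else 0 for a, b in zip(signs, signs[1:])]

signs = [+1, +1, +1, -1, +1, -1, -1, -1]
eta = momentary_persistence(signs)
q_global = sum(eta) / len(eta)
```

The resulting 0/1 series can be analyzed with the same normalized autocorrelation function as the other scalar time series; clusters of zeros in `eta` correspond to the "anti-bunching" phases of the projected trajectory.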
FIG. 2. Two successive vectorial steps, their projections $\Delta x_t$ onto the x-axis, and the signs $s_t$ of
the steps.

FIG. 3. Segment of a trajectory in the RTA model.

FIG. 4. Simulated (symbols) and analytic (black lines) distributions of step lengths $L$ (left, in units
of lu) and turning angles $\phi/\pi$ (right) in the RTA-model.

FIG. 5. Vectorial velocity autocorrelation function in the RTA-model, comparing simulation (red
symbols) with analytical approximation (black line).

FIG. 6. Simulated (symbols) and analytic (black line) distribution of step magnitude in the
projected RTA-model.

FIG. 7. Autocorrelation functions of sign factors in the projected RTA-model (red), and in a
persistent Markov chain of signs (black). Note that the models agree only at lagtimes 0 and 1.

FIG. 8. Logarithmic frequency of selected patterns in a persistent Markov chain of signs $s_t$, for
q = 0.975.

FIG. 9. Logarithmic frequency of selected patterns in the projected RTA model.

FIG. 10. Autocorrelation function of the momentary persistence $\eta_t$ in the projected RTA model
(red) and in a persistent Markov chain of signs with the same q (black).